This notebook walks through a few very simple first steps with SparkR
In [1]:
# Load the SparkR package.
# Expect a few warnings about base R functions that the package masks
library(SparkR)
In [4]:
# Unlike the Python & Scala kernels, the IRkernel does not provide an
# automatically created Spark context, so we have to initialize one
# ourselves. This takes a few moments.
sc <- sparkR.init(master = "local[*]")
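The initializer also accepts an application name and Spark configuration; a commented sketch follows, where the name and memory setting are illustrative assumptions rather than part of the original notebook:
In [ ]:
# Optional: sparkR.init can also take an application name and Spark
# configuration via sparkEnvir (Spark 1.x API). The appName and the
# executor memory value below are illustrative assumptions.
# sc <- sparkR.init(master = "local[*]", appName = "SparkR-demo",
#                   sparkEnvir = list(spark.executor.memory = "1g"))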
In [5]:
# Once we have a Spark context, we can create a SQL context from it
sqlContext <- sparkRSQL.init(sc)
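The SQL context can also load external data sources; a commented sketch, assuming SPARK_HOME points at a Spark installation that includes the bundled example data:
In [ ]:
# Sketch (not run here): read a JSON file through the SQL context.
# Assumes SPARK_HOME is set and contains Spark's bundled example data.
# people <- read.df(sqlContext,
#                   file.path(Sys.getenv("SPARK_HOME"),
#                             "examples/src/main/resources/people.json"),
#                   source = "json")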
In [6]:
# Do something to prove it works
# Load one of the standard datasets that come bundled with R
data(iris)
# Turn the dataset into a SparkR DataFrame; note that SparkR replaces the
# dots in the R column names (e.g. Petal.Width) with underscores
df <- createDataFrame(sqlContext, iris)
# Inspect it
head(filter(df, df$Petal_Width > 0.2))
Out[6]:
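With the DataFrame in place, we can also query it through the SQL context; a minimal sketch, where the table name iris_tbl is an arbitrary choice:
In [ ]:
# Minimal sketch: register the DataFrame as a temporary table and query it
# with SQL. The table name "iris_tbl" is an arbitrary choice.
registerTempTable(df, "iris_tbl")
head(sql(sqlContext, "SELECT Species, COUNT(*) AS n FROM iris_tbl GROUP BY Species"))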
In [ ]:
# If Spark was built with Hive support, a Hive context can be obtained
# from the existing SparkContext in the same way:
# hiveContext <- sparkRHive.init(sc)
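When finished, the Spark context can be shut down with sparkR.stop():
In [ ]:
# Shut down the Spark context when done with it
sparkR.stop()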